Skip to content

Submit delete batches in parallel#3123

Merged
poodlewars merged 5 commits into
masterfrom
deletion-cleanup
Jun 4, 2026
Merged

Submit delete batches in parallel#3123
poodlewars merged 5 commits into
masterfrom
deletion-cleanup

Conversation

@poodlewars
Copy link
Copy Markdown
Collaborator

@poodlewars poodlewars commented May 19, 2026

Discovered while working on delayed deletes. I created a library on PURE with 1000 symbols, with ~200 versions each and 10 snapshots.

Then running delayed deletes had these results:

Side Elapsed (s) Peak RSS (MiB)
arcticdb HEAD (47bba8011, parallel batches) 349.9648 780.3
arcticdb 2d1e60e (pre-batching) 399.6425 792.9
arcticc 333.3081 421.3

Notably running delayed deletes in "dry run" mode (which skips the physical deletion) showed no performance difference between arcticc and ArcticDB, for example on a small example:

Side Mode Elapsed (s) Peak RSS (MiB)
arcticdb dry-run 6.4449 388.2
arcticdb real 19.2177 427.9
arcticc dry-run 8.6100 329.0
arcticc real 11.2915 330.4

arcticc submitted RemoveBatchTask in chunks of 1000 https://mangit.maninvestments.com/projects/DATA/repos/arcticc/browse/arcticcxx_impl/async/async_store.hpp#238 but #70 changed it to this implementation.

If we delete 10k keys, the old implementation in arcticc would submit 10 tasks to delete 1000 keys in parallel, whereas ArcticDB currently submits 10 HTTP requests to delete 1000 keys in serial.

This explains most of the performance difference between arcticc delayed deletes and enterprise delayed deletes.

To stop storage specific deletion batch size limits leaking out of the storage layer, this PR has each storage declare its maximum deletion batch size, which is then used by the AsyncStore when it calculates the batches.

@poodlewars poodlewars added patch Small change, should increase patch version performance labels May 19, 2026
@poodlewars poodlewars marked this pull request as ready for review May 19, 2026 15:14
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 19, 2026

ArcticDB Code Review Summary

Latest commit 7651360 ("Code review comments") cleanly addresses prior feedback: key_types was renamed to unique_key_types, the RemoveKeyResultType typedef was removed in favour of void/folly::Unit returns (verified no stale references remain), and the bulk-delete test now uses config_context_multi. No new issues introduced.

Build and Dependencies

  • FIXED in 31f99ea: cpp/arcticdb/entity/test/test_variant_key.cpp added with unit tests for the key-type helper (renamed to unique_key_types in 7651360). CMakeLists.txt entry resolves.

API and Compatibility

  • The pre-existing Storage.DeleteBatchSize config key (previously read in cpp/arcticdb/util/key_utils.hpp) is replaced in this PR by Storage.DeletePendingBufferSize (different semantics: accumulator/flush threshold rather than per-request batch size). The actual storage batch size is now controlled by the existing S3Storage.DeleteBatchSize / AzureStorage.DeleteBatchSize configs via the new max_delete_batch_size() path. Any externally set Storage.DeleteBatchSize will now be silently ignored - preserve backward compatibility, surface a warning, or call out this rename in the description / release notes.

Documentation

  • Add a release note for this behaviour change. The patch and performance labels are set but no-release-notes is not, so users should be informed that delete throughput improves and that the Storage.DeleteBatchSize config key is no longer honoured.

@poodlewars poodlewars force-pushed the deletion-cleanup branch 2 times, most recently from 9852aeb to 0614e82 Compare May 19, 2026 16:34
Comment thread cpp/arcticdb/storage/s3/nfs_backed_storage.cpp Outdated
Comment thread cpp/arcticdb/CMakeLists.txt
Comment thread cpp/arcticdb/entity/variant_key.hpp Outdated
Comment thread cpp/arcticdb/storage/mongo/mongo_storage.cpp
Comment thread cpp/arcticdb/storage/s3/s3_storage.hpp
Comment thread cpp/arcticdb/storage/s3/detail-inl.hpp
Comment thread cpp/arcticdb/storage/s3/detail-inl.hpp
Comment thread cpp/arcticdb/async/async_store.hpp
);
return folly::collect(std::move(futs)).via(&async::io_executor()).thenValue([](auto&&) {
return std::vector<RemoveKeyResultType>{};
});
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/man-group/ArcticDB/blob/master/cpp/arcticdb/stream/stream_sink.hpp#L49-L54
Don't have to, but if you want to rip out this pointless abstraction while you're here I won't object

keys = {};
keys.reserve(flush_threshold);
const auto batch_size = batch.size();
store->remove_keys(std::move(batch)).get();
Copy link
Copy Markdown
Collaborator

@alexowens90 alexowens90 May 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a slightly more elegant model that would remove the need for flush_threshold . Instead of calling get immediately, we maintain a set of remove_keys futures we've submitted. Folly let's you check if a future is done without calling get() using isReady(), so the size of this collection could be kept bounded. It's quite a bit more complicated though, and probably doesn't add much in this case.
At least worth a comment about why remove_keys is better than remove_keys_sync in this case though.

with (
config_context("S3Storage.DeleteBatchSize", batch_size),
config_context("AzureStorage.DeleteBatchSize", batch_size),
):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a config_context_multi for setting multiple options at once



@pytest.mark.storage
def test_bulk_delete_batching(object_and_mem_and_lmdb_version_store, sym):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite an indirect test of the batching. I get that we don't have query stats for Azure right now, but the test below is much more explicit

@poodlewars poodlewars merged commit 5f661c7 into master Jun 4, 2026
397 of 398 checks passed
@poodlewars poodlewars deleted the deletion-cleanup branch June 4, 2026 08:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

patch Small change, should increase patch version performance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants